22 research outputs found

    Revisiting Data Complexity Metrics Based on Morphology for Overlap and Imbalance: Snapshot, New Overlap Number of Balls Metrics and Singular Problems Prospect

    Data Science and Machine Learning have become fundamental assets for companies and research institutions alike. As one of their fields, supervised classification predicts the class of new samples by learning from given training data. However, some properties can make datasets problematic to classify. In order to evaluate a dataset a priori, data complexity metrics have been used extensively. They provide information about different intrinsic characteristics of the data, which helps to evaluate classifier compatibility and to choose a course of action that improves performance. However, most complexity metrics focus on just one characteristic of the data, which can be insufficient to properly evaluate the dataset with respect to classifier performance. In fact, class overlap, a very detrimental feature for the classification process (especially when imbalance among class labels is also present), is hard to assess. This research work focuses on revisiting complexity metrics based on data morphology. In accordance with their nature, the premise is that they provide both good estimates of class overlap and strong correlations with classification performance. For that purpose, a novel family of metrics has been developed. Being based on ball coverage by classes, they are named Overlap Number of Balls. Finally, some prospects for adapting this family of metrics to singular (more complex) problems are discussed.
    Comment: 23 pages, 9 figures, preprint

    mldr.resampling: Efficient Reference Implementations of Multilabel Resampling Algorithms

    Resampling algorithms are a useful approach to deal with imbalanced learning in multilabel scenarios. These methods have to deal with singularities in multilabel data, such as the occurrence of frequent and infrequent labels in the same instance. Implementations of these methods are sometimes limited to the pseudocode provided by their authors in a paper. This Original Software Publication presents mldr.resampling, a software package that provides reference implementations of eleven multilabel resampling methods, with an emphasis on efficiency, since these algorithms are usually time-consuming.

    Nuevas arquitecturas hardware de procesamiento de alto rendimiento para aprendizaje profundo

    The design and manufacture of hardware is expensive, both in time and in economic investment, which is why integrated circuits are always manufactured in large volumes, to take advantage of economies of scale. For this reason, the majority of processors manufactured are general purpose, which broadens their range of applications. In recent years, however, more and more processors are being manufactured for specific applications, including those aimed at accelerating work with deep neural networks. This article introduces the need for this type of specialized hardware, describing its purpose, operation and current implementations.
    Universidad de Granada: Departamento de Arquitectura y Tecnología de Computadores

    COVIDGR Dataset and COVID-SDNet Methodology for Predicting COVID-19 Based on Chest X-Ray Images

    Currently, Coronavirus disease (COVID-19), one of the most infectious diseases of the 21st century, is diagnosed using RT-PCR testing, CT scans and/or Chest X-Ray (CXR) images. CT (Computed Tomography) scanners and RT-PCR testing are not available in most medical centers, and hence in many cases CXR images become the most time- and cost-effective tool for assisting clinicians in making decisions. Deep learning neural networks have great potential for building COVID-19 triage systems and detecting COVID-19 patients, especially patients with low severity. Unfortunately, current databases do not allow building such systems, as they are highly heterogeneous and biased towards severe cases. The contribution of this article is three-fold: (i) we demystify the high sensitivities achieved by most recent COVID-19 classification models; (ii) in close collaboration with Hospital Universitario Clínico San Cecilio, Granada, Spain, we built COVIDGR-1.0, a homogeneous and balanced database that includes all levels of severity, from normal with Positive RT-PCR, Mild, Moderate to Severe, and contains 426 positive and 426 negative PA (PosteroAnterior) CXR views; and (iii) we propose the COVID Smart Data based Network (COVID-SDNet) methodology for improving the generalization capacity of COVID classification models. Our approach reaches good and stable results, with an accuracy of 97.72% ± 0.95%, 86.90% ± 3.20% and 61.80% ± 5.49% in severe, moderate and mild COVID-19 severity levels, respectively. Our approach could help in the early detection of COVID-19.
    COVIDGR-1.0, along with the severity level labels, is available to the scientific community through this link: https://dasci.es/es/transferencia/open-data/covidgr/
    This work was supported by the project DeepSCOP-Ayudas Fundación BBVA a Equipos de Investigación Científica en Big Data 2018, COVID19_RX-Ayudas Fundación BBVA a Equipos de Investigación Científica SARS-CoV-2 y COVID-19 2020, and the Spanish Ministry of Science and Technology under the project TIN2017-89517-P. S. Tabik was supported by the Ramon y Cajal Programme (RYC-2015-18136). A. Gómez-Ríos was supported by the FPU Programme FPU16/04765. D. Charte was supported by the FPU Programme FPU17/04069. J. Suárez was supported by the FPU Programme FPU18/05989. E.G was supported by the European Research Council (ERC Grant agreement 647038 [BIODESERT]).

    Artificial intelligence within the interplay between natural and artificial computation:Advances in data science, trends and applications

    Artificial intelligence and all its supporting tools, e.g. machine and deep learning in computational intelligence-based systems, are rebuilding our society (economy, education, life-style, etc.) and promising a new era for the social welfare state. In this paper we summarize recent advances in data science and artificial intelligence within the interplay between natural and artificial computation. Recent works published in the latter field and the state of the art are reviewed in a comprehensive and self-contained way to provide a baseline framework for the international community in artificial intelligence. Moreover, this paper aims to provide a complete analysis and some relevant discussion of the current trends and insights within several theoretical and application fields covered in the essay, from theoretical models in artificial intelligence and machine learning to the most prospective applications in robotics, neuroscience, brain computer interfaces, medicine and society in general.
    BMS - Pfizer (U01 AG024904). Spanish Ministry of Science, projects: TIN2017-85827-P, RTI2018-098913-B-I00, PSI2015-65848-R, PGC2018-098813-B-C31, PGC2018-098813-B-C32, RTI2018-101114-B-I, TIN2017-90135-R, RTI2018-098743-B-I00 and RTI2018-094645-B-I00; the FPU program (FPU15/06512, FPU17/04154) and Juan de la Cierva (FJCI-2017–33022). Autonomous Government of Andalusia (Spain) projects: UMA18-FEDERJA-084. Consellería de Cultura, Educación e Ordenación Universitaria of Galicia: ED431C2017/12, accreditation 2016–2019, ED431G/08, ED431C2018/29, Comunidad de Madrid, Y2018/EMT-5062 and grant ED431F2018/02. PPMI, a public-private partnership, is funded by The Michael J. Fox Foundation for Parkinson’s Research and funding partners, including Abbott, Biogen Idec, F. Hoffman-La Roche Ltd., GE Healthcare, Genentech and Pfizer Inc.